The Design and Implementation of FFTW 3

Authors

  • Matteo Frigo
  • Steven G. Johnson
Abstract

FFTW is an implementation of the discrete Fourier transform (DFT) that adapts to the hardware in order to maximize performance. This paper shows that such an approach can yield an implementation that is competitive with hand-optimized libraries, and describes the software structure that makes our current FFTW3 version flexible and adaptive. We further discuss a new algorithm for real-data DFTs of prime size, a new way of implementing DFTs by means of machine-specific “SIMD” instructions, and how a special-purpose compiler can derive optimized implementations of the discrete cosine and sine transforms automatically from a DFT algorithm.
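As a rough illustration of what "adapts to the hardware" can mean, the sketch below times several candidate DFT routines on the current machine and keeps whichever runs fastest for a given size. This is not FFTW's planner or API. The names `make_plan`, `dft_naive`, and `fft_radix2` are hypothetical and chosen only for this example, and the two candidates are deliberately simple.

```python
import cmath
import time


def dft_naive(x):
    """Direct O(n^2) evaluation of the DFT definition."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def is_power_of_two(n):
    return n > 0 and n & (n - 1) == 0


def make_plan(n, repeats=3):
    """Hypothetical planner: time each applicable candidate, return the fastest.

    Returns (name, function). The choice depends on the machine it runs on.
    """
    candidates = {"naive": dft_naive}
    if is_power_of_two(n):
        candidates["radix2"] = fft_radix2
    sample = [complex(i % 7, 0) for i in range(n)]
    best_name, best_time = None, float("inf")
    for name, fn in candidates.items():
        start = time.perf_counter()
        for _ in range(repeats):
            fn(sample)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name, candidates[best_name]
```

Calling `make_plan(n)` once and reusing the returned function for many transforms of the same size means the measurement cost is paid once and shared across all later transforms.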


Similar resources

Implementing FFTs in Practice

Discussion of the considerations involved in high-performance FFT implementations, which center largely on memory access and other non-arithmetic concerns, as illustrated by a case study of the FFTW


Architecture independent short vector FFTs

This paper introduces a SIMD vectorization for FFTW—the “fastest Fourier transform in the west” by Matteo Frigo and Steven Johnson. The new method leads to an architecture independent short vector SIMD FFT vectorization that utilizes the architecture adaptivity of FFTW. It is based on special FFT kernels (up to size 64 and more) that are utilized by FFTW to compute the whole transform. This vec...



The Fastest Fourier Transform in the West (MIT-LCS-TR-728)

This paper describes FFTW, a portable C package for computing the one- and multidimensional complex discrete Fourier transform (DFT). FFTW is typically faster than all other publicly available DFT software, including the well-known FFTPACK and the code from Numerical Recipes. More interestingly, FFTW is competitive with or better than proprietary, highly-tuned codes such as Sun’s Performance Libr...


FFTW: an adaptive software architecture for the FFT

FFT literature has been mostly concerned with minimizing the number of floating-point operations performed by an algorithm. Unfortunately, on present-day microprocessors this measure is far less important than it used to be, and interactions with the processor pipeline and the memory hierarchy have a larger impact on performance. Consequently, one must know the details of a computer architectur...


Publication date: 2005